Offprint from 


LEXICOSTATISTICS 
IN GENETIC LINGUISTICS 


Proceedings of the Yale Conference 
Yale University, April 3-4, 1971 


Edited by 


ISIDORE DYEN 


1973 
MOUTON 


THE HAGUE - PARIS 


SUB-CLASSIFICATION IN HAMITO-SEMITIC 


HAROLD C. FLEMING 


This paper can be classified as a case study or history of the use of lexicostatistics in 
one African linguistic region. I hope to show three things, namely: 

(a) that a lexicostatistical classification of two classes of Ethiopian languages, initially 
intended to produce glottochronological estimates of language splits in a context of a 
cultural-historical problem, resulted in sub-classifications at variance with standard 
classifications current in the literature; 

(b) that other lines of evidence for genetic sub-classification involving these classes 
eventually supported the principal results of the lexicostatistical classification and 
overturned the standard classification; 

(c) that the inconsistencies in the lexicostatistical sub-classification which initially 
created grave doubts about the usefulness of lexicostatistics as a genetic tool were 
resolved by the discovery of heavy inter-borrowing between the critical languages. 

Throughout the inquiry I used the Swadesh 100-item list in an unmodified form.
Most of the lists were gathered in Ethiopia, Kenya, and Tanzania by myself between 
1957 and 1960. In all but three cases the language of interrogation was Amharic. 
Semantic problems occurred normally in eliciting the following items on the Swadesh 
list: ‘thou’, ‘this’, ‘that’, ‘green’, ‘yellow’, ‘round’, ‘to lie down’, and ‘to walk’.
The first three are commonly differentiated by gender in Ethiopia, while the latter 
five either are not distinguished from similar concepts (e.g., ‘black’, ‘white’, ‘to sleep’, 
or ‘to go’) or are simply rare in the northeast African semantic area. Initially,
judgments of cognation were made by similarity criteria; later these judgments were
modified by a growing knowledge of the phonetic correspondences and an awareness 
of probable borrowings. 
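The scoring step behind each percentage can be sketched as follows. This is a minimal illustration, not the procedure actually used in the inquiry: the half-credit option follows the scoring described in footnote 1, while the judgment codes, the exclusion of unelicited items, the treatment of doubtful cases as non-cognate under strict scoring, and the toy data are all assumptions made for the example.

```python
from typing import Optional, Sequence

# Hypothetical judgment codes for one language pair, one entry per list item:
# 1.0 = judged cognate, 0.5 = doubtful, 0.0 = not cognate, None = item not elicited.
Judgment = Optional[float]


def percent_cognate(judgments: Sequence[Judgment], half_credit: bool = True) -> float:
    """Percentage of presumed cognates over the items actually compared."""
    # Assumption: items missing from either list are left out of the denominator.
    scored = [j for j in judgments if j is not None]
    if not scored:
        raise ValueError("no comparable items")
    if not half_credit:
        # Strict either/or scoring; here doubtful cases are counted as non-cognate.
        scored = [1.0 if j == 1.0 else 0.0 for j in scored]
    return 100.0 * sum(scored) / len(scored)


# Invented toy judgments, not data from the paper.
toy = [1.0, 0.5, 0.0, None, 1.0, 0.5, 0.0, 0.0, 1.0, 0.0, 0.0]
print(percent_cognate(toy))                     # 40.0
print(percent_cognate(toy, half_credit=False))  # 30.0
```

With the same judgments, half-credit scoring yields the higher figure, which is the direction of the difference reported in footnote 1.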

The Hamito-Semitic (Afroasiatic) languages of Ethiopia in the received classifica- 
tion have for a long time been divided into two overall genetic classes, Semitic or 
Ethiopic, and Cushitic. The Semitic class usually has been seen as descended en 
masse from Geez, an epigraphic language of northern Ethiopia of the 1st millennium
A.D. Some scholars sub-divide the daughter languages further into two sub-classes, 
North Ethiopic, and South Ethiopic, but this sub-division has remained highly 
controversial. The Cushitic class was usually sub-divided into four main sub-classes
(Northern, Central, Eastern, and Western) to which Joseph Greenberg (1963: 48)
added a fifth - Southern. A minority and increasingly unpopular sub-classification 
joined the Burji-Sidamo sub-class of Eastern Cushitic to Western Cushitic under the 
term Sidama. 

My lexicostatistical inquiry¹ into the Ethiopic class showed that (a) a North-South
division was clearly indicated, but that (b) the Amharic language bridged the gap 
between them, and that (c) northern Ethiopic had considerably more in common with 
Geez than southern Ethiopic did. In addition the so-called Gurage dialect cluster of 
South Ethiopic was not a valid unity; rather some so-called Gurage dialects had more 
cognates with non-Gurage languages than with other Gurage ‘dialects’. Taken as a 
whole, South Ethiopic showed four primary foci or clusters which collectively resembled
a circular chain; percentages of presumed cognation decreased or increased along a 
circular geographical pattern. This was markedly at variance with the few detailed 
sub-classifications of South Ethiopic in the literature. 

Subsequently, four separate confirmations of the lexicostatistical conclusions 
occurred. First, using the criterion of shared lexical innovations and involving primari- 
ly known borrowings from Greek and several clusters of Cushitic, I concluded that 
Geez could not be ancestral to the entire Ethiopic class, that Amharic had borrowed 
heavily from North Ethiopic, that the North-South Ethiopic split was strongly 
indicated, and that so-called Gurage consisted of parts of three distinct links in a 
circular chain. Second, using morphological evidence, A. Murtonen (1967) concluded 
that North and South Ethiopic were genetically distinct and that Geez was ancestral 
only to North Ethiopic. Third, R. Hetzron (1972), using exclusively the criterion 
of shared morphological innovation against a background of substantial phonetic 
reconstruction, concluded that Geez was not ancestral to South Ethiopic, that North 
and South Ethiopic were genetically distinct, and that the so-called Gurage cluster 
represented parts of three distinct sub-classes of South Ethiopic. Fourth, Marvin 
Bender (1971), using lexicostatistics and a computer, came to virtually the same sub- 
classes of Ethiopic that I had. 

My inquiry² into the Cushitic class showed that the basic division into five sub-
classes was very clearly marked. Languages within one of the received branches 
always shared at least 5 percentage points more (usually 10 percentage points or more) 
with any co-member of a branch than with any outsider, while comparisons between 
branches tended to fall in the 10-15% range. Exceptions to these conclusions involv- 


1 Matrix tables of percentages of similarities/retentions are not offered here for either Ethiopic or
Cushitic. My purpose in omitting them is to avoid redundancy in publication. Marvin Bender’s
matrix tables agree closely with mine, and are being published in two places (Bender 1971, 1973), so 
that no useful purpose is seen in adding my matrix tables here. My percentages tend towards higher 
figures than his because I scored doubtful cases as half-cognate while he scored each item as either 
cognate or non-cognate. In a few cases our percentages disagree rather more sharply, probably
because we used slightly different lists and because the field data themselves differed more than
usual. Differences in judgments of cognation also account for some of the disparities.

2 General conclusions summed up in Fleming 1969.

ed West Cushitic languages or the geographical extremes (Egypt and Tanzania). It 
was noteworthy that West Cushitic languages usually shared less than 10% with 
outside languages and sometimes only around 10% among themselves. Salient ex- 
ceptions to these figures occurred when the West Cushitic Ometo group was com- 
pared with the adjacent Burji-Sidamo group of East Cushitic. In these comparisons 
percentages of retention were much higher on the average (around 15%). This seemed 
to be at variance with the received hypothesis that lumped Burji-Sidamo together with 
the West Cushitic class under the name Sidama. 
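The five-point criterion stated at the start of this paragraph lends itself to a mechanical check. The sketch below is illustrative only: the language labels, branch labels, and percentages are invented, the function names are hypothetical, and it assumes a complete matrix in which every branch has at least two members.

```python
from typing import Dict, FrozenSet, List


def branch_margin(lang: str,
                  branch_of: Dict[str, str],
                  pct: Dict[FrozenSet[str], float]) -> float:
    """Lowest percentage shared with a co-member minus the highest shared with an outsider."""
    inside, outside = [], []
    for other in branch_of:
        if other == lang:
            continue
        score = pct[frozenset((lang, other))]
        if branch_of[other] == branch_of[lang]:
            inside.append(score)
        else:
            outside.append(score)
    return min(inside) - max(outside)


def exceptions(branch_of: Dict[str, str],
               pct: Dict[FrozenSet[str], float],
               threshold: float = 5.0) -> List[str]:
    """Languages whose margin falls below the threshold (in percentage points)."""
    return [lang for lang in branch_of
            if branch_margin(lang, branch_of, pct) < threshold]


# Invented toy data: two hypothetical branches, A and B, with two languages each.
branch_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
pct = {
    frozenset(("A1", "A2")): 30.0,
    frozenset(("B1", "B2")): 28.0,
    frozenset(("A1", "B1")): 12.0,
    frozenset(("A1", "B2")): 14.0,
    frozenset(("A2", "B1")): 11.0,
    frozenset(("A2", "B2")): 24.0,
}
print(exceptions(branch_of, pct))  # ['B2']
```

In the toy data B2 is flagged because its score with the outsider A2 (24) comes within five points of its score with its co-member B1 (28). A high cross-branch figure of this kind is what one would then examine for possible borrowing.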

The Cushitic figures looked very suspicious for several reasons. Most critically, 
the West Cushitic group is centrally located and in intimate contact with East Cushitic 
and Ethiopic Semitic. In addition West Cushitic figures seemed grossly inconsistent, 
with external comparisons sometimes exceeding internal ones, while the geographical 
extremes — Kafa in the north and Ari in the south — had extraordinarily low percentages 
of around 10%. Finally, the figures failed to show consistent patterns within East 
Cushitic. 

Therefore, I concluded that lexicostatistical analyses had failed to provide a coherent 
or plausible picture of Cushitic sub-divisions. So I abandoned the technique and 
began using more customary methods, particularly shared lexical innovations, 
morphological agreements, and phonetic reconstruction. Practically everything
needed to be done in any case, because practically no competent historical work had
been done on the 70-odd Cushitic languages. The literature consisted overwhelmingly
of assertions liberally sprinkled with random etymologies. An Italian linguist, M.M. 
Moreno (1940), had published a brief but careful review of morphological features 
which united and distinguished the branches of Cushitic. He had been troubled by
the divergence of West Cushitic from the others. Checking on and expanding his 
inquiry, I realized that West Cushitic was quite devoid of common Cushitic morpho- 
logical patterns. This analysis was also later confirmed by A. N. Tucker (1967) of 
London. West Cushitic phonology likewise tended to be divergent. 

It became clear in time that a number of common diagnostic Cushitic words were 
absent in West Cushitic, which on the other hand had a common set peculiar to itself. 
As reconstruction advanced, it became obvious that proto-West Cushitic would not 
fit into proto-Cushitic. Its pronoun patterns, verbal paradigms, and gender markers 
at least would be as dissimilar as those of proto-Semitic or archaic Egyptian, IF NOT 
MORE SO. 

At this point I returned to lexicostatistics, fortified by a better knowledge of the 
probable cognations. The results were generally the same as before, except that they 
were more consistent because several areas of heavy inter-borrowing were now clear. 
I therefore concluded that West Cushitic would have to be separated from Cushitic. 
To test this further, comparisons were extended to the Chadic and Berber families of 
Hamito-Semitic, and later to Semitic. Cushitic languages normally had about as much
in common with other families as with West Cushitic, which itself normally had less in
common with those families (with percentages falling close to ZERO). It was noted that
comparisons between Chadic and Berber languages³ or Egyptian consistently showed
figures in the 8% to 15% range.

Recently, M. Bender (1971) has confirmed the PATTERN of my results within Cushitic, 
although his percentages run consistently LOWER. I have now renamed West Cushitic 
as Omotic and classified it as a separate family of Hamito-Semitic.⁴ Initial reactions
from interested scholars suggest that the division into Cushitic and Omotic is a 
sound one which will be fruitful in producing more viable reconstructions within 
Hamito-Semitic as a whole. 

I have concluded that lexicostatistics is not merely a heuristic device to use in sub-
classification. It is part of the proof. 


BIBLIOGRAPHY 


Bender, Marvin L. 
1971 “The Languages of Ethiopia. A New Lexicostatistical Classification and Some Problems 
of Diffusion”, Anthropological Linguistics (in press). 
Bender, Marvin L. and J. D. Bowen, R. L. Cooper, C. A. Ferguson, et al.
1973 Language in Ethiopia (Oxford University Press and Haile Selassie University Press) (in press).
Fleming, Harold C. 
1969 “The Classification of West Cushitic within Hamito-Semitic”, in Daniel F. McCall,
N. R. Bennett, J. Butler, eds., Eastern African History (= Boston University Papers on
Africa 3) (Frederick Praeger).
Greenberg, Joseph H. 
1963 Languages of Africa (Indiana University). 
Hetzron, Robert 
1972 “Ethio-Semitic”, in M. L. Bender, et al., Language in Ethiopia. 
Moreno, Martino Mario 
1940 Manuale di Sidamo (Torino). 
Murtonen, A. 
1967 A Diachronological Inquiry into the Relationship of Ethiopic to the Other So-called South- 
East Semitic Languages (Leiden: E. J. Brill). 
Tucker, Archibald N. 
1967 “Fringe Cushitic: An Experiment in Typological Comparison”, BSOAS XXX, Pt. 3. 


3 This surprising finding was checked and re-checked by comparing Chadic representatives drawn 
from the most divergent sub-groups of Chadic, some of which were only a little closer to each other 
than to one of the fairly closely related Berber languages. I am convinced that further investigation 
will in fact overturn the traditional view that the families of Hamito-Semitic are equi-distant. 

4 To be published in Bender et al. 1973. 


